0. Introduction



Given an alphabet $S = \{1, 2, 3, \ldots, d\}$ on $d$ symbols the full shift is the dynamical system
$(S^{\mathbb{Z}}, \sigma)$. Here $\sigma$ is the left shift: $(\sigma x)_i = x_{i+1}$, $\forall i \in \mathbb{Z}$.
As the full shifts are rather simple objects the main study in topological dynamics usually
concentrates on subshifts of finite type (SoFT), where one further restricts the sequences by
listing all allowed neighboring symbols. This leads to the transfer matrix formulation which for
1-dimensional sequences has a pretty complete theory.

If the symbol interaction is not nearest neighbor but of finite range, one can still remain in the
SoFT setup by using a larger alphabet (formed from blocks of symbols). For infinite range
interactions this is no longer possible and the transfer matrix formalism breaks down.

Suppose that we impose an infinite restriction, an exclusion rule, and define:

(0.1) $X_{(d,f)} = \{x \in S^{\mathbb{Z}} \mid x_i \neq x_{i+f(n)},\ \forall i \in \mathbb{Z},\ n \in \mathbb{N}\}$,

where $f : \mathbb{N} \to \mathbb{N}$ ($\mathbb{N} = \{1, 2, 3, \ldots\}$) is strictly increasing, hence also unbounded.
Note that the definition implies a symmetric rule (with respect to any $i$). One could interpret this
as some kind of exclusion principle: symbols on "shells" of radius $f$ repel their kind. Mike Keane
proposed a model like this, with the particular choice $f(n) = n^2$ ([LM]). As a first question he
asked for which $S$ the space is nonempty. It turns out that even for the simplest choices of the jump
sequence $f$ answering this is fairly involved. Note that in the symbolic dynamics context it would be
most natural to expect a finite alphabet, $d < \infty$.

Alternatively one could view $\mathbb{Z}$ as a graph where there is an edge between two distinct
vertices/integers $i$ and $j$ iff $|j - i| = f(n)$ for some $n \in \mathbb{N}$. Coloring the graph by requiring
that adjacent vertices have different colors leads to the natural question of the chromatic number
of this (undirected) graph. This formulation seems to be due to Paul Erdős who was curious about the
$f$ for which the chromatic number is finite.

In this paper we will first present some examples and then review the known results. Through 
these the problem is then connected to more general dynamics and to additive number theory. It 
turns out that in some sense for most combinations of d and / the shift spaces are empty. This 
will lead us to consider how the termination of the sequences comes about. Analysis of these
generation algorithms is the main content of the latter half of the paper. 



1. Infinite sequences 
1.1. First observations 

Together with the two-sided sequences of (0.1) we also deal with the one-sided version $X^+_{(d,f)}$
where the rule is restricted to $\mathbb{N}$. If we can show that $X^+_{(d,f)}$ is small/empty then by
$X_{(d,f)}|_{\mathbb{N}} \subset X^+_{(d,f)}$ (the restriction to positive integers) the same holds for the
two-sided space. The inclusion holds since in $X^+_{(d,f)}$ the past haunts only back to the coordinate 1
whereas in $X_{(d,f)}$ the block can of course arise from a symbol arbitrarily far in the past. A second
observation is even more basic but we state it nevertheless: $X_{(d,f)} \subset X_{(d',f)}$ if $d < d'$. This
holds for one-sided sequences, too. If we can show the smallness for some alphabet size, it holds for
all smaller alphabets.

Let us first consider some explicit cases to gain some insight into the contribution of the
alphabet size and growth rate of the jump sequence. By lexicographically generating the one-sided
sequences we mean that one proceeds from $x_1 = 1$ rightwards to the nearest coordinate always
assigning the smallest allowed symbol.
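The lexicographic generation just described can be sketched in a few lines of Python. This is only an illustration under our own conventions, not code from the paper: the function name `lexicographic_sequence` and its parameters are hypothetical, and the jump sequence is passed as a callable `f`.

```python
from typing import Callable, List


def lexicographic_sequence(d: int, f: Callable[[int], int], length: int) -> List[int]:
    """Greedy one-sided generation: each site gets the smallest symbol that is
    not excluded by an earlier site at distance f(n). Stops early when stuck."""
    # Jump distances f(1) < f(2) < ... that can occur inside the window.
    jumps = []
    n = 1
    while f(n) < length:
        jumps.append(f(n))
        n += 1
    seq: List[int] = []  # seq[j - 1] holds x_j
    for j in range(1, length + 1):
        blocked = {seq[j - 1 - g] for g in jumps if j - g >= 1}
        free = [s for s in range(1, d + 1) if s not in blocked]
        if not free:  # every symbol is excluded at site j
            break
        seq.append(free[0])
    return seq


print(lexicographic_sequence(2, lambda n: 2 * n - 1, 10))  # alternating 1, 2
print(lexicographic_sequence(2, lambda n: 2 * n, 10))      # gets stuck early
```

With two symbols the odd jumps $2n-1$ give the alternating word, while the even jumps $2n$ leave no symbol for the fifth site, in line with Example 1.1 below.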

Example 1.1. (linear $f$): Consider the case $S = \{1, 2\}$ and $f(n) = 2n$. Clearly if $x_0 = 1$ then
$x_{2k} = 2$, $\forall k \ge 1$ but $x_2 = 2$ implies $x_{2m} = 1$, $\forall m > 1$, a contradiction. So
$X_{(2,2n)} = \emptyset$.

On the other hand $X_{(2,2n-1)} \neq \emptyset$: it consists of the two periodic points $(12)^*$.

It is worth noticing that the emptiness is not due to too small $S$. Let $f(n) = kn$, $k \ge 2$ a fixed
integer. If $x_0 = 1$, then $x_{kn} = \neg 1$, $\forall n \ge 1$ ($\neg s$ meaning "any symbol but $s$"). If then
$x_k = 2$, we have $x_{lk} \notin \{1, 2\}$, $\forall l \ge 2$. Continuing like this, considering symbol choices at
coordinates that are multiples of $k$, one exhausts any finite alphabet i.e. $X_{(d,kn)} = \emptyset$ for any
finite $d$.

Example 1.2. ($d = 2$ and power $f$): Suppose $S = \{1, 2\}$ and $f(n) = n^r$, $r = 2, 3, \ldots$ Let $x_0 = 1$;
then $x_{2i} = 1$, $\forall i \in \mathbb{Z}$ so in particular $x_{2^r} = 1$ which generates a contradiction. Hence
$X_{(2,n^r)} = \emptyset$. Here the minimal size of $S$ is crucial: it strongly forces parity.

Example 1.3. (fast growing $f$): Suppose there is a natural number $m$ which does not divide
$f(n)$ for any $n = 1, 2, \ldots$ (subsequently we use the notation $a \mid b$ meaning $a$ divides $b$). Clearly for
e.g. $kn$, $n^r$, $n!, \ldots$ this isn't the case, but $p^n$ has this property for prime $p$. For $d \ge m$ one can
then have infinite periodic points. Hence for example $X_{(3,2^n)} \neq \emptyset$ (at least 6 elements, $(123)^*$
and its permutations).

It is straightforward to see that $X_{(d,n!)} = \emptyset$ at least for $d = 2, 3$. One simply sets $x_1 = 1$, lays
down the blocks from it, sets $x_2 = 2$ and then the blocks that it generates etc. With $d = 2$ one is
stuck in the second assignment round, with $d = 3$ in the sixth.

$X_{(4,n!)}$ is already considerably more interesting. With a computer one can see that, lexicographically
generated, the sequence quickly becomes periodic with period (1234123412312341231234123)
(of length 25). However this period cannot persist indefinitely since starting with $x_1 = 1$ we would
have $x_{1+n!} = 1$ whenever $25 \mid n!$. The sequence does not terminate then though; instead there is a
short transient after which the original period is reinstated (in the lexicographic generation of the
sequence). Since $25 \mid n!$, $\forall n \ge 10$ we see that this periodicity would be interrupted infinitely often.
Is there an infinite sequence in $X_{(4,n!)}$? In computer runs one with the given period will persist at
least for 500,000 steps.

The existence or non-existence of a period leads naturally to a language theoretic characterization
of these spaces. For a background on these we refer to [HU]. Let $|s|$ denote the length of the
string/sequence $s$.

Theorem 1.1.: If for any $m \in \mathbb{N}$ there is $n \in \mathbb{N}$ such that we have $m \mid f(n)$ then the words
satisfying the exclusion rule of (0.1) do not form a context-free language. Hence the strings do not
form a regular language (sofic shift) either.

Proof: A context-free language necessarily satisfies a Pumping lemma: any sufficiently long string
$s$, say $|s| \ge k$, can be written as $s = uvxyz$ such that (i) $|vxy| \le k$, (ii) $|vy| \ge 1$ and (iii) $uv^n x y^n z$
is an allowed string for all $n \in \mathbb{N}$. Here (ii) means that the "pumping" is indeed a nontrivial action.
If $\min\{|v|, |y|\} = 0$ but (ii) still holds, this more general Pumping lemma for context-free languages
is reduced to that for regular languages.

In our context the validity of a Pumping lemma would mean that there exists some period
string that allows arbitrarily long repetitions always resulting in a legal string. But if this period
is of length $m$ and $m \mid f(n)$ for some $n \in \mathbb{N}$ then e.g. the first symbol in the block contradicts the
identical symbol in the $(f(n)/m)^{\text{th}}$ block. ∎
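The obstruction used in the proof is easy to check by machine. The following minimal sketch (the helper `violations` and the sample jump lists are our own illustrative choices) lists the pairs of sites at which a finite word breaks the exclusion rule, and applies it to a word of period 3.

```python
def violations(seq, jumps):
    """Return index pairs (i, i + g), g in jumps, carrying equal symbols."""
    bad = []
    for i in range(len(seq)):
        for g in jumps:
            if i + g < len(seq) and seq[i] == seq[i + g]:
                bad.append((i, i + g))
    return bad


word = [1, 2, 3] * 100                           # periodic word, period m = 3
powers_of_two = [2 ** n for n in range(1, 9)]    # 3 never divides 2^n
squares = [n * n for n in range(1, 18)]          # 3 divides 9 = 3^2

print(violations(word, powers_of_two))           # no violations
print(violations(word, squares)[0])              # first clash, at distance 9
```

For the jumps $2^n$ the period 3 word stays legal, whereas for the squares the first clash occurs at distance $9$, the first jump divisible by the period.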

Remarks: The result ties the divisibility properties of $\{f(n)\}_{n \in \mathbb{N}}$ to the complexity of the
sequences: if non-divisibility prevails periodic sequences exist, if not then the recipe for constructing
infinite sequences must be more complicated. Examples like $X_{(3,2^n)}$ or $X_{(4,\mathrm{primes})}$ containing
periodic points are rare specialities. But the latter half of Example 1.3. suggests that forbidden
periodicity is likely just part of the story.



1.2. Lacunary case 

Considering one-sided sequences it is clear that for a finite $d$ a faster growing $f$ allows more freedom
for the sequence generation. In particular P. Erdős seems to have formulated in the mid 80's
the question whether lacunary sequences are sufficiently fast growing for infinite sequences/finite
chromatic number to prevail. Recall that $\{x_n\}$ is lacunary if there is $\epsilon > 0$ such that
$x_{n+1}/x_n > 1 + \epsilon$, $\forall n$. This problem was solved by Y. Katznelson and published later in [K]:

Theorem 1.2.: For any lacunary $f$ there is a finite $d$ such that $X_{(d,f)} \neq \emptyset$.

Remarks 1. This is the sequence translation of the original solution which was in terms of the
chromatic number. The solution connected the problem to another one, in Diophantine approximation
(also by Erdős), and solved them using a novel dynamical system formulation.
2. Katznelson didn't provide an explicit bound for the chromatic number/number of symbols needed
although one was implicit in his work. Others have later improved the bound, the best being now
$1 + c|\log \epsilon|/\epsilon$ where $\epsilon$ is as in the definition of lacunarity and $c$ an absolute constant ([PS]).
Note that while of course useful this still doesn't resolve explicit cases like Example 1.3. i.e.
whether $X_{(4,n!)}$ is non-empty.

It is natural to connect this to recurrence as follows (see [W] for more detail). Given $(d, f)$ let $X$ be
the subset (0.1) and $\rho$ the natural metric it inherits from the full shift.

Definition 1.3.: $D \subset \mathbb{N}$ is a non-recurrence set for the (topological) dynamical system $(X, \sigma, \rho)$
if for some positive $b$ and all $x \in X$ and $d \in D$ it holds that $\rho(\sigma^d x, x) \ge b$.

It is easy to show that the finiteness of the chromatic number is equivalent to $D$ being a
non-recurrence set ([W]). So in particular Theorem 1.2. shows that a lacunary $D$ is a set of
non-recurrence. Not much seems to be currently known about non-lacunary non-recurrence sets: very
few examples are known.

For subsequent use we define $D$ to be a set of recurrence if the chromatic number is infinite.

1.3. Sublacunary realm 
1.3.1. Dynamics approach 

We now chart a bit what is known about the sublacunary case. The resulting map will unfortunately
be somewhat patchy, perhaps an indication of the fact that not all the right concepts are yet known.
Clearly recurrence sets and non-recurrence sets partition the space of sequences but unfortunately
there doesn't seem to be a definitive criterion for the recurrence sets. Within them most attention
has been paid to Poincaré sequences.

Definition 1.4.: A sequence $D \subset \mathbb{N}$ is a Poincaré sequence if for a (measurable) dynamical
system $(Y, \mathcal{B}, \mu, T)$ and any $B \in \mathcal{B}$, $\mu(B) > 0$ it holds that $\mu(T^{-d_k} B \cap B) > 0$ for some
$d_k \in D$.

Remarks: 1. The Poincaré Theorem concerns the case $D = \mathbb{N}$, showing this $D$ to be a set of
recurrence.

2. An ingenious example by I. Kříž shows that Poincaré sequences are not all of the recurrence sets.

As the Definition above is not explicit, other forms have been searched for. For a subset of integers
$A$, let the difference set be $A - A = \{a - a' \mid a, a' \in A\}$ and let $A^{(N)} = A \cap \{1, 2, 3, \ldots, N\}$. By
the upper density of $A$ we mean $\limsup_{N \to \infty} |A^{(N)}|/N$. Bertrand-Mathis established the equivalence:
$D$ is a Poincaré sequence if and only if $D \cap (A - A) \neq \emptyset$ for all $A$ of positive upper density ([B-M]).
Due to this formulation, in the literature such $D$ is also called an intersective sequence.

The developments of the subject gained definitive momentum when László Lovász conjectured
(apparently just verbally) that if one insists that $n^2 \notin A - A$ for all natural numbers $n$ then $A$ cannot
have positive density. In the late seventies Furstenberg and Sárközy separately managed to prove
(in [F] and [S]):

Theorem 1.5.: Given $\delta > 0$ there is $N_0(\delta)$ such that if $N > N_0(\delta)$ and $|A^{(N)}| \ge \delta N$ then there is
a natural $n$ such that $n^2 \in A - A$.

Remark: The proofs were of different character; Furstenberg's was ergodic theoretic, Sárközy's utilized
Fourier analytic techniques enabling quantitative estimates for the density. This will have some
implications in our further work.

Theorem 1.5. together with the Bertrand-Mathis equivalence shows that $\{n^2\}$ is an example of
a Poincaré/intersective sequence. Later on more examples of such sequences have been worked
out by Furstenberg and many others. We are not going to list these but just highlight that e.g.
all monomials $n^k$ and more generally polynomials satisfying the hypothesis of Theorem 1.1. are
intersective (the latter result is due to Kamae and Mendès France, [KMF]).

While the Bertrand-Mathis alternative is conceptually valuable, a criterion often more usable is

Lemma 1.6.: If for all $\alpha \in (0, 2\pi)$ $\lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{n} e^{i \alpha s_k} = 0$ then $\{s_k\}$ is a Poincaré sequence.

Remark: These are called Weyl sums due to his first use of them to show uniform distribution 
(for integers). 
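The averages in Lemma 1.6. are straightforward to evaluate numerically for a finite sample. The sketch below is only an illustration (the name `weyl_average` is ours); it computes the finite average for the simplest sequence $s_k = k$, where the geometric sum makes the decay explicit. A finite computation of course only suggests, and never proves, that the limit vanishes.

```python
import cmath


def weyl_average(seq, alpha):
    """(1/n) * sum_{k=1}^{n} exp(i * alpha * s_k) for a finite sample s_1..s_n."""
    return sum(cmath.exp(1j * alpha * s) for s in seq) / len(seq)


# s_k = k: the sum is geometric, so the average decays like 1/n.
for n in (100, 1000, 10000):
    print(n, abs(weyl_average(range(1, n + 1), 1.0)))
```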

The results above have direct bearing on our original problem. Consider for simplicity its 1-sided
version with $d$ symbols and $f$ an intersective polynomial. Let $A_i = \{j \mid x_j = i\}$; these are just the
sets of coordinates where the $i^{\text{th}}$ symbol is found. Suppose now that $A_i$, $i = 1, \ldots, d$ partition $\mathbb{N}$
i.e. the 1-sided sequence is uniquely defined everywhere on $\mathbb{N}$. Then Theorem 1.5. generalized as
indicated above implies that if the exclusion is to hold for this sequence, for sufficiently large $N$ the
upper densities of the $A_i$'s cannot add up to 1, a contradiction. Hence

Theorem 1.7.: For intersective polynomials $f$ the spaces $X^+_{(d,f)}$ and therefore all $X_{(d,f)}$ are empty.

To gain some insight on what can be done from first principles and more importantly, how the 
sequence termination takes place, let us take for a moment a purely elementary view. 

Example 1.4. ($d = 3, 4$ and $n^2$): To directly resolve the case $d = 3$ one can utilize simple
identities like $3^2 + 4^2 = 5^2$: instead of dealing with all the entries up to a given distance left and
right of the origin, one only needs to check a rather thin subset of them. Let $x_0 = 1$ and generate its
blocks on $\{-25, \ldots, 25\}$. Then set $x_{16} = 2$ and generate the blocks from it. In the third iteration one
sets $x_{-9} = 3$ (or $x_{25} = 3$ just as well) since there is a double block at it already due to the given
identity. Now the full alphabet is in use. If e.g. at the doubly blocked coordinate 7 one sets $x_7 = 1$
the following (fourth) assignment iterate succeeds but the next one results in a full block at several
sites on $\{-25, \ldots, 25\}$. Hence $X_{(3,n^2)} = \emptyset$.

This procedure of trying to generate a contradiction in the construction of two-sided sequences
gets rather unwieldy for larger $S$ even for this $f$. Suspecting the spaces might be empty one could
switch instead to considering one-sided sequences i.e. aiming to show that $X^+_{(d,n^2)} = \emptyset$. Indeed here
one succeeds with a computer assisted proof (CAP). Checking all these sequences systematically
is a manageable task and it turns out they are finite, the longest being of length 47. The next
alphabet size is exponentially harder and CAP on a desktop machine does not seem to terminate
(sequences of length about 170 exist).
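An exhaustive search of this kind can be sketched as a depth-first search over one-sided words. The code below is our own illustration and not the CAP program used for the results above; the function name `longest_one_sided`, the length cap `cap` and the symmetry-breaking rule are assumptions made for the example.

```python
import math


def longest_one_sided(d: int, cap: int) -> int:
    """Longest word x_1..x_L (L <= cap) over {1..d} with x_i != x_j whenever
    j - i is a perfect square, found by depth-first search."""
    squares = [n * n for n in range(1, math.isqrt(cap) + 1)]
    seq = []
    best = 0

    def extend() -> None:
        nonlocal best
        best = max(best, len(seq))
        if len(seq) == cap:
            return
        j = len(seq)  # 0-based index of the next site
        blocked = {seq[j - g] for g in squares if j - g >= 0}
        # Symbols are interchangeable: allow at most one not yet used symbol.
        limit = min(d, max(seq, default=0) + 1)
        for s in range(1, limit + 1):
            if s not in blocked:
                seq.append(s)
                extend()
                seq.pop()
                if best == cap:  # the cap was reached, nothing more to learn
                    return

    extend()
    return best


print(longest_one_sided(2, 20))
```

A return value equal to `cap` only says that the cap was reached; a smaller value is the true maximal length. For $d = 2$ the search stops at length 4, since any longer word would have $x_5 = x_1$ at the square distance 4.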

The problem quickly gets rather intractable for higher monomials $n^r$, $r \ge 3$, one reason being
that there are no simplifying identities like $k^r = n^r + m^r$ available then due to Fermat's theorem. For
$r = 3$ a CAP program finds one-sided sequences of length at least 300. Note that by the argument
in Theorem 1.1. we immediately know that there are no periodic sequences in any of these spaces.

1.3.2. Additive combinatorics 

Sárközy's result inspired plenty of subsequent additive number theory. Some of these results shed
light on our original problem. They were often parallel results to those in the previous section, while
simultaneously being quantitative. We will now briefly highlight this development.

Sárközy's proof of Theorem 1.5. utilized the Hardy-Littlewood Circle Method. By honing it further
Balog, Pelikán, Pintz and Szemerédi established in 1994

Theorem 1.8.: Fix a natural $k \ge 2$. If $n^k \notin A - A$ for all $n$ then

$$\frac{|A^{(N)}|}{N} \ll (\log N)^{-c \log\log\log\log N}.$$

Here $\ll$ means "less than constant times". The best $c$ has been worked out to be $1/\log 3$. This in turn
has later been extended to intersective polynomials $f \in \mathbb{Z}[x]$, perhaps culminating in J. Lucier's
result in 2006 ([L]):

Theorem 1.9.: Suppose $f \in \mathbb{Z}[x]$ is an intersective polynomial of degree $k \ge 2$ with positive leading
term. If $f(n) \notin A - A$ for all $n$ with $f(n) > 0$ then

$$\frac{|A^{(N)}|}{N} \ll \left( \frac{(\log\log N)^{\mu}}{\log N} \right)^{1/(k-1)}$$

for $\mu = 3$ if $k = 2$, $\mu = 2$ if $k > 2$ and the constant only depends on $f$.

Like in the dynamics approach, from these results one can readily deduce Theorem 1.7. Additionally
here one gets an upper bound for the instant when the "total density" of the $d$ subsequences drops
below 1 i.e. the contradiction takes place. But as one can see from the formulas the bound will be
huge. This is due to the method of proof, not the phenomenon at hand. This absence of reasonable
bounds will motivate us to investigate in the next section how long sequences satisfying (0.1) in the
sublacunary realm can actually prevail.

One could of course argue that in the sublacunary case there is nothing special about polynomials.
Perhaps so, in which case the findings above are just to indicate explicit cases where the original
problem can be solved. However there is a reason to believe that Theorem 1.7 is generic in the sense
that for all sublacunary rates of growth, there are $f$ such that the chromatic number is infinite i.e.
$X_{(d,f)}$ is empty for all $d$ (see [AHK]).
Our knowledge of the sublacunary realm is hence left in a somewhat curious state. We do not
know whether for all sublacunary growth rates there are non-empty $X_{(d,f)}$ coexisting with the empty
ones. Our Example 1.1. ($\{2n\}$ is recurrent, $\{2n - 1\}$ is non-recurrent) shows that this is possible
for at least some rate. Is there an upper bound for the growth rate below which such coexistence is
possible? Perhaps the inevitable conclusion from this state of affairs is that the growth rate is just
too crude a measure for the problem. One should use something (explicit!) that genuinely captures
the recurrence properties of the $f$-sequence.

2. Finite sequences 
2.1. Random generation 

We now proceed to the random generation of the simpler one-sided subset of sequences compatible
with the exclusion (0.1) and the analysis of the outcome of this algorithm. The main results here are
aimed at showing that a probabilistic model gives surprisingly accurate information on the length
distribution. While the methods are applicable to a wide variety of jump sequences, $f(n) = n^2$ is the
only one for which we give a statistical assessment.

Our algorithm generates random sequences obeying the restriction (0.1) on $\mathbb{N}$. Each coordinate is
chosen independently and uniformly but in such a way as to respect the restrictions from all the
coordinates on its left.

Algorithm v2.0:

0. Set $M \ge 1$, let $S_j = S$ at each $j \in \{1, \ldots, M\}$ and set $i = 1$.

1. If $S_i = \emptyset$ then halt,
   else pick uniformly a random symbol $s \in S_i$.

2. Update $S_j \leftarrow S_j \setminus \{s\}$ for all $j = i + f(n)\ (\le M)$, $n \in \mathbb{N}$.

3. If $i = M$ halt and call full length,
   else $i \leftarrow i + 1$ and go to 1.

So just halting means that the random procedure did not result in a legal string of symbols of length 
M but produced a contradiction earlier. "Full length" means that length M was reached and the 
sequence needs to be continued with larger M to further address its viability. 
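For concreteness, the steps of the Algorithm can be sketched in Python as follows. The original runs were done in Mathematica, so this is merely a re-expression under our own naming (`random_sequence`, `allowed`, `rng` are hypothetical), with the jump sequence passed as a callable `f`.

```python
import random
from typing import Callable, List, Optional


def random_sequence(d: int, M: int, f: Callable[[int], int],
                    rng: Optional[random.Random] = None) -> List[int]:
    """Run the steps 0-3 above once and return the symbols laid down."""
    rng = rng or random.Random()
    # Step 0: every site starts with the full alphabet (index 0 is unused).
    allowed = [set(range(1, d + 1)) for _ in range(M + 1)]
    seq: List[int] = []
    for i in range(1, M + 1):
        if not allowed[i]:                   # Step 1: full block, halt
            break
        s = rng.choice(sorted(allowed[i]))   # uniform among allowed symbols
        seq.append(s)
        n = 1                                # Step 2: block s at i + f(n) <= M
        while i + f(n) <= M:
            allowed[i + f(n)].discard(s)
            n += 1
    return seq                               # len(seq) == M is "full length"


lengths = [len(random_sequence(4, 200, lambda n: n * n, random.Random(seed)))
           for seed in range(1000)]
print(sum(lengths) / len(lengths))           # sample mean of the halting length
```

The returned length is the last site that received a symbol; a length equal to `M` corresponds to the "full length" outcome.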

One might perhaps also want to think of the sequence $\{s_i\}$ as a random walk in a random
environment (RWRE). Since the walker itself generates the future obstacles one could call the walk
in this sense self-similar.

Definition 2.1.: Given a site $j \ge 2$ call the set of sites $D_j = \{1, 2, \ldots, j - 1\} \cap \{k \mid k = j - f(n) \ge 1,\ n \in \mathbb{N}\}$
the dragnet of $j$ and let $D_1 = \emptyset$. The cardinality of the dragnet is an increasing step
function. Let the interval between the $(d + i - 1)^{\text{th}}$ and $(d + i)^{\text{th}}$ steps span the $i^{\text{th}}$ interval, $i \ge 1$.
Its length is denoted by $l_i$.

If the sites in the dragnet $D_j$ support the entire alphabet $S$, the site $j$ has a full block. The site
where the first full block can materialize is the first jump, the next step after that is the second
jump etc.






Suppose that we take the particular choice $f(n) = n^2$. Then we have explicitly

Proposition 2.2.: Given an alphabet of size $d$, the location of the $i^{\text{th}}$ jump is at $(d + i - 1)^2 + 1$,
$i = 1, 2, \ldots$ The $i^{\text{th}}$ interval $\{(d + i - 1)^2 + 1, \ldots, (d + i)^2\}$ is of length $l_i = 2(d + i) - 1$.

The proof is a simple calculation and we omit it. With this choice of jumps the first dragnet of size $d$
materializes at $j = d^2 + 1$, the beginning of the first interval, which is also the first jump site. The
first interval ends at $(d + 1)^2$. With the Algorithm one can always generate at least sequences of
length $d^2$, but after that i.e. at the beginning of the first interval there is the possibility that the
sites in the dragnet generate a full block which terminates the generation.
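The dragnet and the intervals are simple to compute, which gives a quick sanity check of these statements for $f(n) = n^2$. The helper names `dragnet`, `interval` and `square` below are illustrative assumptions, not notation from the text.

```python
def dragnet(j, f):
    """Sites k = j - f(n) >= 1, i.e. the earlier sites that can block site j."""
    sites = []
    n = 1
    while j - f(n) >= 1:
        sites.append(j - f(n))
        n += 1
    return sorted(sites)


def interval(d, i):
    """Sites of the i-th interval for f(n) = n^2: (d+i-1)^2 + 1, ..., (d+i)^2."""
    return range((d + i - 1) ** 2 + 1, (d + i) ** 2 + 1)


square = lambda n: n * n

print(dragnet(17, square))                  # dragnet of the first jump for d = 4
print(len(interval(4, 1)), 2 * (4 + 1) - 1) # interval length versus 2(d+i) - 1
```

For $d = 4$ the dragnet first reaches size 4 at $j = 17 = d^2 + 1$, and every site of the $i^{\text{th}}$ interval has a dragnet of size $d + i - 1$.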

2.2. A probabilistic model 

We now proceed to set up and investigate a simplified model for the halting of the Algorithm.

To state our first result we first recall that for non-negative integers $k_i$ the multinomial
coefficient is defined by the expression

$$\binom{k_1 + k_2 + \cdots + k_d}{k_1\ k_2\ \ldots\ k_d} = \frac{(k_1 + k_2 + \cdots + k_d)!}{k_1!\, k_2! \cdots k_d!},$$

where $0! = 1$ if need be. If $d = 2$ this is just the Binomial coefficient. The following basic property
of multinomials will be needed subsequently:

The Multinomial Theorem: For $n \ge d$

(2.1) $$\sum_{k_1 + \cdots + k_d = n} \binom{n}{k_1\ k_2\ \ldots\ k_d} = d^n,$$

where the sum is d-fold over the non-negative integers $k_i$.

The proof amounts to inductively rewriting $(1 + \cdots + 1)^n$ in a more explicit form. It can be found in
many combinatorics and statistics books (see e.g. [A]).

Theorem 2.3.: Assume that all the symbols on $\{1, 2, \ldots, j - 1\}$ have been laid out independently
and uniformly from $S$ i.e. they are distributed $\left(\frac{1}{d}, \frac{1}{d}, \ldots, \frac{1}{d}\right)$. Let $B_j$ be the event that one has
the first full block at $j$ in the $i^{\text{th}}$ interval. Then

(2.2) $$p_i = P(B_j) = \frac{1}{d^{\,d+i-1}} \sum_{\substack{k_r \ge 1,\ r = 1, \ldots, d \\ k_1 + \cdots + k_d = d+i-1}} \binom{d+i-1}{k_1\ k_2\ \ldots\ k_d},$$

where the sum is d-fold over the given positive integers.

Proof: For $j$ in the $i^{\text{th}}$ interval the dragnet is of cardinality $d + i - 1$. Since the entries on the
support of the dragnet $(j_1, \ldots, j_{d+i-1})$ are independent, each elementary event $(s_{j_1}, \ldots, s_{j_{d+i-1}})$
is of probability $d^{-(d+i-1)}$. A full block at $j$ materializes if and only if each symbol of $S$ appears
at least once in this dragnet. Since the order of the symbols is immaterial the multiplicity of the
elementary event is given by the multinomial coefficient

$$\binom{d+i-1}{k_1\ k_2\ \ldots\ k_d}$$

with $k_r \ge 1$ for all $r = 1, \ldots, d$.

All these arrangements on the dragnet are disjoint, hence the sum (2.2) gives the total probability
of the full block. ∎
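The probability of a full block can be evaluated exactly for small parameters. The sketch below computes it in two ways: by the d-fold multinomial sum described above, and by inclusion-exclusion over the missing symbols. The latter is a standard counting identity for surjections that we swap in here because it is much faster; it is not the route taken in the text, and the function names are our own.

```python
from fractions import Fraction
from itertools import product
from math import comb, factorial


def p_full_block(d, i):
    """P(all d symbols occur among d + i - 1 uniform draws), inclusion-exclusion."""
    n = d + i - 1
    onto = sum((-1) ** k * comb(d, k) * (d - k) ** n for k in range(d + 1))
    return Fraction(onto, d ** n)


def p_full_block_multinomial(d, i):
    """Same probability from the d-fold sum of multinomials with all k_r >= 1."""
    n = d + i - 1
    total = 0
    for ks in product(range(1, n + 1), repeat=d):
        if sum(ks) == n:
            coeff = factorial(n)
            for k in ks:
                coeff //= factorial(k)   # exact at every step
            total += coeff
    return Fraction(total, d ** n)


print(p_full_block(3, 1), p_full_block_multinomial(3, 1))
```

The brute-force sum is only feasible for small $d$ and $i$, but it is enough to confirm that the two expressions agree.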

Remark: The independence assumption here could of course be a heavy one. For one thing identical
subsequent symbols ($s_j = s_{j+1}$) are forbidden in the original model but not here. For large $S$ one
expects exclusions like this to be less significant i.e. our simplified model should then be more
accurate. We return to this in Section 2.3. after the data comparison.

The sequence $\{p_i\}$ is crucial in understanding our model. It is of course non-decreasing since the
dragnet is not shrinking with increasing $i$. Indeed one can easily show that $p_i$ is strictly increasing
and $p_i \uparrow 1$. More specifically

Proposition 2.4.: For an alphabet $S$ of size $d$ one has for all $i \ge 1$

(2.3) $$1 - p_i < d \left(1 - \frac{1}{d}\right)^{d+i-1}.$$

Proof: Geometrically the Multinomial Theorem (2.1) is a statement about the total sum of the
entries on the $n^{\text{th}}$ level of Pascal's d-simplex i.e. on the simplex face perpendicular to
$(1, 1, \ldots, 1) \in \mathbb{N}^d$ at (lattice) distance $n$ from the top of the pyramid (which is at the origin). For
the size of $1 - p_i$ we need by Theorem 2.3 to estimate the boundary sum on that face:

$$\sum_{\substack{k_r = 0 \text{ for some } r \\ k_1 + \cdots + k_d = d+i-1}} \binom{d+i-1}{k_1\ k_2\ \ldots\ k_d}.$$

Without loss of generality assume that at least $k_d = 0$. Then by the Multinomial Theorem

(2.4) $$\sum_{k_1 + \cdots + k_{d-1} = d+i-1} \binom{d+i-1}{k_1\ k_2\ \ldots\ k_{d-1}} = (d-1)^{d+i-1}.$$

But the d-simplex has $d + 1$ sides and its $(d + i - 1)^{\text{st}}$ level has $d$ sides. For each of them (2.4) holds,
hence for all boundary terms we have the bound

(2.5) $$\sum_{\substack{k_r = 0 \text{ for some } r \\ k_1 + \cdots + k_d = d+i-1}} \binom{d+i-1}{k_1\ k_2\ \ldots\ k_d} < d\,(d-1)^{d+i-1},$$

which is a strict inequality since corners are accounted multiple times and the multinomial is at least
1. But dividing (2.5) by $d^{d+i-1}$, the total sum over the $(d + i - 1)^{\text{st}}$ level, gives the result. ∎

Remark: Although this is not subsequently used, it is interesting to note that in spite of the slight
slack in (2.5), there seems to be no asymptotic gap in (2.3) (numerical check).



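Theorem 2.5.: Let the assumption on the sequence be as in Theorem 2.3. and let $f(n) = n^2$.
Then the sequence generation halts at $j$ i.e. the distribution of the full block is given by

(2.6) $$P(\text{halts at } j) = \begin{cases} 0, & 1 \le j \le d^2; \\ (1 - p_1)^{j - d^2 - 1}\, p_1, & j \text{ in the first interval}; \\ \left( \prod_{k=1}^{i-1} (1 - p_k)^{l_k} \right) (1 - p_i)^{j - d^2 - 1 - \sum_{k=1}^{i-1} l_k}\, p_i, & j \text{ in the } i^{\text{th}} \text{ interval},\ i \ge 2. \end{cases}$$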



Proof: Before $j = d^2 + 1$ the dragnet is too small to support all of $S$ hence a full block is impossible.
On the first interval the distribution is geometric: with probability $1 - p_1$ no block and with
probability $p_1$ a block and thereby halting. At each instant the termination is independent of the past.

At times $j = (d + i)^2 + 1$, $i = 1, 2, \ldots$ the parameter changes from $p_i$ to $p_{i+1}$ but otherwise
the termination mechanism is still geometric. Compounding this to account for the later intervals
gives the general case in the expression (2.6). ∎
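The piecewise geometric law can be tabulated directly by carrying along the survival probability. The following is a minimal floating-point sketch for $f(n) = n^2$; the function names are ours, and $p_i$ is evaluated by inclusion-exclusion (an assumption of convenience, adequate in double precision only for moderate $d$).

```python
from math import comb


def p_full_block(d, i):
    """p_i: all d symbols occur among d + i - 1 independent uniform draws."""
    n = d + i - 1
    return sum((-1) ** k * comb(d, k) * (1 - k / d) ** n for k in range(d + 1))


def halting_pmf(d, j_max):
    """pmf[j] = P(halts at j), j = 1..j_max, in the independent model."""
    pmf = [0.0] * (j_max + 1)       # pmf[0] unused; zero for j <= d^2
    survive = 1.0                   # probability of no halt before site j
    i = 1
    for j in range(d * d + 1, j_max + 1):
        if j > (d + i) ** 2:        # site j opens the next interval
            i += 1
        p = p_full_block(d, i)
        pmf[j] = survive * p
        survive *= 1.0 - p
    return pmf


pmf = halting_pmf(4, 400)
model_mean = sum(j * q for j, q in enumerate(pmf))
print(model_mean)
```

For $d = 4$ the resulting mean should come out close to the model mean reported in Table I.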

Corollary 2.6.: The halting time distribution has an exponential tail. The sequences generated
are almost surely finite.

Proof: Within the $i^{\text{th}}$ interval the halting probability (2.6) is geometrically decreasing with the rate
$1 - p_i$. Between the $i^{\text{th}}$ and $(i + 1)^{\text{st}}$ intervals the probability jumps,

$$r_i = \frac{P(\text{halts at } (d + i)^2 + 1)}{P(\text{halts at } (d + i)^2)} = \frac{(1 - p_i)\, p_{i+1}}{p_i}.$$

For small $i$ this expression can and will exceed 1. But by Proposition 2.4. $p_i \to 1$ and hence also
$p_{i+1}/p_i \to 1$ so $\max\{r_i, 1 - p_i\}$ will eventually remain below $1 - \epsilon$ for some $\epsilon > 0$ which implies the
result for $i$. The lengths $l_i$ satisfy $l_{i+1} = l_i + 2$, so in particular $l_{i+1} < (1 + \delta) l_i$ for some $\delta > 0$ and
the exponential decay holds in $j$ as well.

If $A_j = \{\exists\ \text{sequences of length} \ge j\}$ then by the above $P(A_j) \le e^{-cj}$ for some positive $c$.
Hence $P(A_j)$ is summable and by the Borel-Cantelli Lemma $P(A_j \text{ i.o.}) = 0$. ∎

Remark: All the results above before Theorem 2.5. were independent of the particular choice
$f(n) = n^2$. In this theorem the assumption was made just to have an explicit (but still complicated)
formula. In the Corollary one needs to control the interval lengths well enough, a task difficult only
for rather wildly behaving $f$.

2.3. Model versus data



To investigate the properties of the sequences generated by the Algorithm we collected samples for
alphabets of sizes $d = 4, 5, 7, 10, 15$ and $20$. The upper limit arises simply from the computational
effort. If we set $M = 2000$, a reasonable guess (after some playing around) for an upper bound
for termination for $d = 20$, the sequences must be sieved out from $20^{2000}$ candidates using the
Algorithm. Computed with Mathematica on a decent workstation these runs are timed in days.
The sample sizes (number of sequences generated) on which the following observations are based are
indicated in Table I, last column.

On the other hand to assess the probabilistic model the main hurdle is to compute explicit
values for $p_i$ for sufficiently many $i = 1, 2, 3, \ldots$ since these parametrize the halting distribution in
(2.6). Here one can expedite things by computing the multinomials in the most efficient way. The
key sum in (2.2),

$$\sum_{\substack{k_r \ge 1,\ r = 1, \ldots, d \\ k_1 + \cdots + k_d = d+i-1}} \binom{d+i-1}{k_1\ k_2\ \ldots\ k_d},$$

can be simplified. E.g. for $i = 4$ it can be computed as

$$\binom{d}{1} \binom{d+3}{1\ \ldots\ 1\ 4} + 2 \binom{d}{2} \binom{d+3}{1\ \ldots\ 1\ 2\ 3} + \binom{d}{3} \binom{d+3}{1\ \ldots\ 1\ 2\ 2\ 2},$$

an expression far easier to deal with than the original ($\ldots$ denotes here a constant string of 1's).
The idea here is to keep record of the multiplicity of the string that appears inside the multinomial
(like 23 in the middle expression: it can appear in $\binom{d}{2}$ different ways and in 2 different orders).
But this too gets rather unwieldy for large $d$ with $i$ beyond the smallest values.
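The $i = 4$ simplification can be cross-checked against the number of surjections from $d + 3$ draws onto $d$ symbols, which is what the key sum counts. In the sketch below the surjection count is obtained by inclusion-exclusion, an independent route that is our own choice for the check; the function names are hypothetical.

```python
from math import comb, factorial


def surjections(n, d):
    """Number of words of length n over d symbols in which every symbol occurs."""
    return sum((-1) ** k * comb(d, k) * (d - k) ** n for k in range(d + 1))


def key_sum_i4(d):
    """The three-term expression for i = 4, i.e. n = d + 3 draws."""
    f = factorial(d + 3)
    return (comb(d, 1) * f // factorial(4)                        # one k_r = 4
            + 2 * comb(d, 2) * f // (factorial(2) * factorial(3)) # a 2 and a 3
            + comb(d, 3) * f // factorial(2) ** 3)                # three 2's


for d in range(1, 9):
    print(d, key_sum_i4(d), surjections(d + 3, d))
```

The three terms correspond to the three ways of distributing the excess $3 = (d + 3) - d$ among the $k_r$: as $3$, as $2 + 1$, or as $1 + 1 + 1$.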

Figure 1, middle row, is a joint plot of the sequence termination instant distributions of the data
and the model for $d = 10$. One sees a reasonable overall match but also distinct discrepancies
due to our simplified model. Both plots peak at the same spot, the start of the fifth interval
($i = 5$ so $(d + i - 1)^2 + 1 = 197$) and their levels there are fairly close to each other (0.0140179 and
0.0150393, empirical and model resp.). The model favours slightly earlier halting as witnessed in
the distributions: it both rises and sets earlier.

The model graph has by definition geometric slopes on the intervals and so indeed does the data:
this is clear from the log-plot on the right of the same data (plot is ln(number of samples + 1)). The
experimental and model slope tilts are actually fairly close to each other as seen on the left, especially
in the middle part of the distribution. The key observation here though is that the tail of the data
distribution seems indeed majorized by an exponential and tapers off around the same range of $j$ as it
does for the model. This strongly suggests that the sequences don't get very far. The top row of
Figure 1 presents the same comparison for $d = 5$. The overall distribution match is actually quite
remarkable. In the finer details like the slope roughness there is deviation from the model (although
the sample size is here at its largest, see Table I). We return to this in Section 2.3.

Figure 1: Empirical (blue/rough) and theoretical (red/continuous) halting distributions
superimposed together (left) and corresponding log empirical distribution (right) for different
alphabet sizes. From top to bottom row: d = 5, 10 and 15.

The left plots of the bottom row (for $d = 15$) exhibit increasing symmetry in the distributions. This
also hints at the high $d$ limit: with suitable scaling the halting distribution might be normal.
Magnifying the slopes here shows that the mechanism that gave the top row data the "roughness"
is not there anymore. One is tempted to think that this is due to randomness smoothing it
out.

A perhaps somewhat troubling feature of the mid and bottom plots on the left in Figure 1 is the
vertical offset between the slopes. There is a way to see a dependency mechanism in the sequences that
contributes to it appearing. While this does not seem to provide a way to improve the model it
is still worth understanding.

Consider the assignment sequence indicated in Figure 2. Here $a, b, c$ and $d$ are coordinates,
$b = a + k^2$, $c = a + n^2$, $d = b + n^2$, $n > k$. Since $x_a = s$ we know that $x_b, x_c \neq s$ as indicated.
One should think of a dragnet being at $d$, hence $b$ and $c$ belong to it and $a$ may or may not belong.
The simplest case (and probably the one contributing most to the probabilities involved) is of course
$k = 1$, $n = 2$. We are interested in the probability of a full block appearing at $d$.




Figure 2: A dependency mechanism affecting the termination probability. 

Ignoring all other symbols on the interval $\{a, \ldots, d\}$ two cases should be distinguished:

1. if $k^2 + n^2$ is not a square then no symbol from the triple $x_a$, $x_b$ and $x_c$ forbids $x_d = s$ but $x_b$
and $x_c$ do block one or two non-$s$ symbols at $d$.

2. if $k^2 + n^2$ is a square then $x_a$ forbids $x_d = s$ and furthermore $x_b$ and $x_c$ block one or two non-$s$
symbols at $d$.

On the basis of these we see immediately that

$$P(\text{full block at } d \mid k^2 + n^2 \text{ not square}) < P(\text{full block at } d) < P(\text{full block at } d \mid k^2 + n^2 \text{ square}).$$

But as we vary $k$ and $n$ there are many more sites $a$ where 1. instead of 2. holds. Hence one should
expect the full block i.e. termination probabilities to be majorized by the independent probabilities of
the model. This is indeed the case as seen in Figure 1. How to quantify this is another matter
though, well worth further study.

To further characterize the sampled sequences we computed their expected lengths and standard
deviations. These and the corresponding model data are presented in Table I for a few $d$ values. For
large $d$ we already mentioned the problem of computing $p_i$ for high $i$. These are needed for an
accurate tail estimate of the distribution. The asterisks in the Table are due to this complication.

The values of the two statistical indicators here just quantify aspects of the earlier distributional 
matching. Somewhat surprisingly, the best match occurs at $d = 5$, indicating that small alphabets 
do not seem to suffer from the independence assumption in the model. 

The decay rate of the density in Theorem 1.9 can be used to estimate termination times. 
These turn out to be truly staggering compared to the real ones above. This is of course due to the 
proof methods and suggests that other arguments truer to the nature of the problem should be found. 

Symbols   Empirical   Empirical   Model      Model       Sequences
d         mean        std. dev.   mean       std. dev.

4         27.2542     5.13374     23.992     5.23924     50 · 10^6
5         39.5672     8.28983     39.2172    8.22516     80 · 10^6
6         60.?247     13.5813     59.3666    11.9?13     80 · 10^6
7         89.4687     18.5912     84.982     16.5113     30 · 10^6
10        209.315     38.2887     199.562    35.1369     20 · 10^6
15        566.87      92.2796     543.291    84.4349     10 · 10^6
20        1156.57     170.829     *          *           5 · 10^6

Table I. Some empirical and theoretical statistics of the one-sided sequences. Means and standard 
deviations of the halting times at various alphabet sizes. Asterisks refer to missing coefficients. 
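
Growth exponents for the empirical columns of Table I can be estimated with a log-log least-squares fit. The sketch below is only an illustration: it uses the rows whose entries are fully legible ($d = 4, 5, 7, 10, 15, 20$), and the helper name `loglog_slope` is an assumption, not something defined in the text.

```python
from math import log

# (d, empirical mean, empirical std. dev.) taken from Table I;
# the d = 6 row is omitted because two of its entries are not fully legible.
ROWS = [
    (4, 27.2542, 5.13374),
    (5, 39.5672, 8.28983),
    (7, 89.4687, 18.5912),
    (10, 209.315, 38.2887),
    (15, 566.87, 92.2796),
    (20, 1156.57, 170.829),
]


def loglog_slope(xs, ys):
    """Least-squares slope of log(y) against log(x)."""
    lx = [log(x) for x in xs]
    ly = [log(y) for y in ys]
    mx = sum(lx) / len(lx)
    my = sum(ly) / len(ly)
    sxy = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    sxx = sum((a - mx) ** 2 for a in lx)
    return sxy / sxx


ds = [row[0] for row in ROWS]
mean_exponent = loglog_slope(ds, [row[1] for row in ROWS])
std_exponent = loglog_slope(ds, [row[2] for row in ROWS])

if __name__ == "__main__":
    print(f"fitted exponent, empirical mean:      {mean_exponent:.3f}")
    print(f"fitted exponent, empirical std. dev.: {std_exponent:.3f}")
```

A fit over so few points is sensitive to the small-$d$ rows, so the fitted slopes should be read as rough indications rather than as the asymptotic exponents.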

Analyzing the empirical data further reveals rather clearly certain exponents at play: the values in 
the second column grow almost exactly like $d^{5/2}$ and those of the third column roughly like 
$d^{?}$. Since there seems to be an increasingly random behavior in the halting of the sequences 
for higher $d$, we venture to 

Conjecture 2.7.: Suppose $T^{(d)}$ is the halting instant of Algorithm v2.0 with $f(n) = n^2$. For 
sufficiently rapidly growing $M(d)$ there are positive constants $a$ and $b$ such that as $d \to \infty$ 

$$P\left[ \frac{T^{(d)} - a\,d^{5/2}}{b\,d^{?}} < x \right] \to \Phi(x) \quad \forall x \in \mathbb{R},$$

where $\Phi$ is the cumulative distribution function of the standard normal $N(0, 1)$. 



Remarks: 1. For the sequence length $M(d)$ we just need a rate that outgrows the offset rate $d^{5/2}$. 
Clearly for the left threshold $(d^2 + 1 - a\,d^{5/2}) / (b\,d^{?}) \to -\infty$ as desired. 

2. A partial result would be to show the Central Limit Theorem for the model. For this one would 
need to analyze the asymptotics of $p_i(d)$ and then the scaling limit of (2.6). 

3. The CLT for the algorithm would of course imply the a.s. emptiness of the sequence spaces. 



2.4. Some fine detail 



While generating the sequences some additional information was gleaned as well. To gain some 
insight into how the haltings come about we recorded the pairs $(i, n)$ at which the first terminal 
block was generated. With this one knows that while the sequence may be extended a bit (at least 
one step), it cannot be continued past $i + n^2$. Note that because of this the resulting 
distribution for $i$ is not exactly the earlier distribution of the halting instant $j$, but is 
slightly shifted to the left. 

Algorithm v2.1: 

0. set $M > 1$, let $S_j = S$ at each $j \in \{1, \ldots, M\}$ and set $i = 1$. 

1. pick uniformly a random symbol $s \in S_i$. 

2. update $S_j \leftarrow S_j \setminus \{s\}$ for all $j = i + f(n) \in \{i + 1, \ldots, M\}$, $n \in \mathbb{N}$. 

   if $S_j = \emptyset$ results for some $j$, record $(i, n)$, halt and call empty, 
   if $i = M$ halt and call full length, 
   else $i \leftarrow i + 1$ and go to 1. 
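
The steps of Algorithm v2.1 can be sketched in Python as follows. This is an illustrative reading of the listing above, not the code used for the experiments: it assumes $f(n) = n^2$, the function name `run_v21` and the return convention are hypothetical, and when several sites become empty in the same step it reports the smallest jump $n$ (the listing does not say which one is recorded).

```python
import random
from math import isqrt


def run_v21(d: int, M: int, rng: random.Random):
    """One run of the sampling loop with f(n) = n^2 (assumed).

    Returns ("empty", i, n) when the symbol placed at site i empties the
    allowed set at site i + n^2, or ("full", M, None) if all M sites get
    a symbol.
    """
    # allowed[j] plays the role of S_j; index 0 is unused so sites are 1-based.
    allowed = [set(range(1, d + 1)) for _ in range(M + 1)]
    for i in range(1, M + 1):
        # Step 1: uniform pick from S_i (sorted only for reproducibility).
        s = rng.choice(sorted(allowed[i]))
        # Step 2: exclude s at every site j = i + n^2 that is still <= M.
        for n in range(1, isqrt(M - i) + 1):
            j = i + n * n
            allowed[j].discard(s)
            if not allowed[j]:
                return ("empty", i, n)
    return ("full", M, None)


if __name__ == "__main__":
    rng = random.Random(12345)  # arbitrary seed, for illustration only
    for _ in range(5):
        print(run_v21(5, 600, rng))
```

Collecting the returned pairs $(i, n)$ over many runs gives the raw data for an $(i, n)$-plot.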

With this modified algorithm one gets to see from where the terminal jumps originate. Figure 3 
illustrates their distributions, call it the $(i, n)$-plot, for two $d$-values. 






Figure 3: Termination distributions for d = 5 (top) and 15 (bottom). Iterate i runs rightwards 
and jump size n downwards. Same distributions in each row; on the left a log-plot, on the right 
the plain support (non-zero entries). Data from 20 and 10 million samples respectively. Note the 
different scales up and down. 

The first distinct feature, the left hand "staircase" in the plots, can be readily explained. 

Proposition 2.8.: The staircase jumps in the $(i, n)$-plot are at $d^2 + 2(n - 1)d - 2(n - 1)$, 
$n \geq 1$, and the step length is $2(d - 1)$. 

Proof: If a terminal block is generated with a jump of size $n^2$, then the previous $d - 1$ 
(non-terminal) blocks into this future site have been generated with jumps of size at least 
$(n + 1)^2, (n + 2)^2, \ldots, (n + d - 1)^2$. These particular jump sizes give the minimal $i$ 
for which $i + n^2$ results in a termination. So necessarily $j = i + n^2 \geq (n + d - 1)^2 + 1$, 
from which we get that $i \geq (n + d - 1)^2 - n^2 + 1 = \cdots = d^2 + 2((n - 1)d - n + 1)$. 
Evaluating this for $n \geq 1$ gives the step values and their difference gives the step size. ∎ 
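
The algebra in the proof can be verified numerically. The short sketch below compares the bound $(n + d - 1)^2 - n^2 + 1$ with the closed form of the proposition over a range of $d$ and $n$; the function names `min_start` and `staircase` are illustrative only.

```python
def min_start(d: int, n: int) -> int:
    """Smallest i permitted by the bound i + n^2 >= (n + d - 1)^2 + 1."""
    return (n + d - 1) ** 2 - n ** 2 + 1


def staircase(d: int, n: int) -> int:
    """Closed form of the staircase position stated in Proposition 2.8."""
    return d * d + 2 * (n - 1) * d - 2 * (n - 1)


# Check that both expressions agree and that consecutive steps differ by 2(d - 1).
for d in range(2, 30):
    for n in range(1, 50):
        assert min_start(d, n) == staircase(d, n)
        assert staircase(d, n + 1) - staircase(d, n) == 2 * (d - 1)

if __name__ == "__main__":
    print([staircase(5, n) for n in range(1, 6)])  # prints [25, 33, 41, 49, 57]
```

For $d = 5$ the staircase thus starts at $i = 25$ and advances in steps of 8.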

In a similar fashion to Definition 2.1 one can formulate the locations and lengths of the horizontal 
$(l \times 1)$-blocks inside the distributions at any given level $n$. Within these the distribution 
of terminations seems to be approximately geometric, as expected from our earlier analysis. 

A couple of things in these plots are far more intriguing, though. The first is the internal and 
right-end structure in the plots for $d = 5$. There seem to be some hard combinatorial constraints 
at play here, hence the lace-like interior and the abrupt wall on the right (at 173). Note that the 
plots are based on a large data set of $20 \cdot 10^6$ samples. Furthermore, for the computation the 
sequence length attempted was $M = 600$, i.e. the graphic shown is a cutout at 200 without any 
boundary effect. Also, a blank, i.e. a position with no sample points, is a rather improbable event 
for low $n$ in the interior of the plot unless there is a definite reason for the absence. 

The lower plots clearly indicate random phenomena dominating for a larger alphabet size (here 
$d = 15$). Apart from the left staircase there are no traces of hard constraints elsewhere. The 
right-hand side of the distribution tapers off like the halting distribution earlier, most likely 
with exponential tails at all $n$-levels. 

3. Conclusions 

Here we have tried to chart what still seems to be unexplored or at least rather hazy terrain. This 
of course applies to much of the study of long range order, e.g. in Statistical Mechanics and related 
fields. To come up with an empty set is perhaps a bit disappointing, but hopefully the methods 
here are useful in further work. At least we know how stringent even the most innocuous looking 
exclusion rules are if applied with unbounded reach (once the obvious case of periodicity has been 
ruled out). Moreover, the large discrepancy between the termination times arising from the proofs 
and from the simple probabilistic model suggests that there should be a more natural proof to be found. 

In the latter half we have restricted ourselves mostly to the jump sequence $f(n) = n^2$. The 
reason is two-fold: firstly, the original question was about this and secondly, other nontrivial 
$f$'s get quickly out of hand computationally. Our probabilistic model should work verbatim for 
$n^k$, $k \geq 3$, but because of their jump sizes, computing the exact values of $p_i$ for high $i$ 
remains a problem. So does generating data: the range $M$ in our algorithms has to be much larger, 
hence a far bigger number of exclusions has to be recorded. The reason why at high $d$ a generated 
sequence eventually terminates remains the same: the bulk accumulation of exclusions at a site 
(which is overcome only beyond the lacunar threshold). But a much larger store of excluded symbol 
values needs to be kept in order to finally record a full block. Conjecture 2.7 is likely to be true 
more generally, but for lack of supporting statistical data we will not venture to guess it. 